2 research outputs found

A Machine Learning Approach: Socio-economic Analysis to Support and Identify Resilient Analog Communities in Texas

Author: Mabadeje Ademide O.
Publication date: 26/08/2022

Identifying analog resources or items is important during the planning and development of new communities because available information is usually limited or absent. Conventionally, analogs are identified by domain experts; however, such expertise is not always readily available. Compounding this challenge, most available data in socioeconomic systems are high dimensional, which makes interpreting and visualizing these datasets difficult. Hence, it is crucial to adopt a workflow that can identify analogs regardless of the data's high dimensionality. To this end, we present two systematic and unbiased measures, the group similarity score (GCS) and the similarity scoring metric (SSM), to support the predictive search for missing properties of target communities and the identification of analogous cities based on available socioeconomic data and modeling. Because each Texan community can be characterized by its associated properties, the workflow combines spatial and multivariate statistics in a novel manner to determine the GCS and SSM while visualizing the associated uncertainty space. The workflow consists of three major steps: 1) key parameter selection via feature engineering; 2) multivariate and spatial analysis using multidimensional scaling (MDS) and density-based spatial clustering of applications with noise (DBSCAN) for clustering analysis; and 3) similarity ranking using a modified Mahalanobis distance function as a clustering basis on preprocessed data. Afterwards, to assess the quality of the predicted features and analog communities obtained, the K-nearest neighbor algorithm is applied, and the analog cities are then found. The workflow is demonstrated on high-dimensional socioeconomic data. We find analogs for each identified community cluster, with their GCS and SSM, in relation to 4 randomly selected communities used for testing. Thus, we recommend integrating this workflow into uncertainty exploration, trend mapping, community analog assignment, and benchmarking to support decision making.

IC2 Institute; Petroleum and Geosystems Engineerin
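As a rough illustration of the similarity-ranking step described in the abstract, the sketch below ranks candidate communities by their Mahalanobis distance to a target community. It uses the standard Mahalanobis distance, not the modified function from the paper (which the abstract does not specify), and the function name `mahalanobis_rank`, the feature columns, and the numbers are all hypothetical.

```python
import numpy as np


def mahalanobis_rank(features, target_index):
    """Rank rows of `features` by standard Mahalanobis distance to one row.

    Returns (order, dist): `order` lists the other row indices from most to
    least similar, and `dist` holds the distance of every row to the target.
    """
    # Covariance across communities; columns are treated as variables.
    cov = np.cov(features, rowvar=False)
    # Pseudo-inverse guards against a singular covariance matrix.
    cov_inv = np.linalg.pinv(cov)
    diff = features - features[target_index]
    # Quadratic form diff_i^T * cov_inv * diff_i for every row i.
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    dist = np.sqrt(np.maximum(d2, 0.0))
    order = np.argsort(dist)
    order = order[order != target_index]
    return order, dist


# Hypothetical data: one row per community, two made-up socioeconomic features.
features = np.array([
    [50.0, 100.0],   # community 0 (target)
    [51.0, 102.0],   # community 1
    [70.0, 300.0],   # community 2
    [30.0, 50.0],    # community 3
    [90.0, 500.0],   # community 4
])

order, dist = mahalanobis_rank(features, target_index=0)
print("analog candidates, most similar first:", order.tolist())
```

Unlike plain Euclidean distance, this measure rescales each feature by its variance and accounts for correlation between features, which is one reason a Mahalanobis-type distance is a natural fit for mixed socioeconomic variables.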

Texas ScholarWorks

Rigid Transformations for Stabilized Lower Dimensional Space to Support Subsurface Uncertainty Quantification and Interpretation

Author: Mabadeje Ademide O.
Pyrcz Michael J.
Publication date: 15/08/2023

Subsurface datasets inherently possess big data characteristics such as vast volume, diverse features, and high sampling speeds, further compounded by the curse of dimensionality from various physical, engineering, and geological inputs. Among the existing dimensionality reduction (DR) methods, nonlinear dimensionality reduction (NDR) methods, especially metric multidimensional scaling (MDS), are preferred for subsurface datasets due to the inherent complexity of these datasets. While MDS retains intrinsic data structure and quantifies uncertainty, its limitations include unstable solutions that are unique only up to Euclidean transformations and the absence of an out-of-sample points (OOSP) extension. To enhance subsurface inferential and machine learning workflows, datasets must be transformed into stable, reduced-dimension representations that accommodate OOSP. Our solution employs rigid transformations to obtain a stabilized, Euclidean-invariant representation of the lower-dimensional space (LDS). By computing an MDS input dissimilarity matrix and applying rigid transformations to multiple realizations, we ensure transformation invariance and integrate OOSP. This process leverages a convex hull algorithm and incorporates a loss function and normalized stress for distortion quantification. We validate our approach with synthetic data, varying distance metrics, and real-world wells from the Duvernay Formation. Results confirm our method's efficacy in achieving consistent LDS representations. Furthermore, our proposed "stress ratio" (SR) metric provides insight into uncertainty, which is beneficial for model adjustments and inferential analysis. Consequently, our workflow promises enhanced repeatability and comparability in NDR for subsurface energy resource engineering and associated big data workflows.

Comment: 30 pages, 17 figures, Submitted to Computational Geosciences Journa
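To make the idea of stabilizing an embedding with rigid transformations concrete, here is a minimal sketch that aligns one 2D embedding to a reference using translation plus an orthogonal map (the classical orthogonal Procrustes solution) and reports a normalized stress value. This is a generic illustration, not the authors' workflow: the names `rigid_align` and `normalized_stress` are hypothetical, the stress formula is one common normalization that may differ from the paper's, and the convex hull and OOSP steps are not shown.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix between all rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def rigid_align(reference, embedding):
    """Map `embedding` onto `reference` using translation plus an orthogonal
    transform (rotation or reflection) that minimizes the squared error."""
    ref_mean = reference.mean(axis=0)
    ref_c = reference - ref_mean
    emb_c = embedding - embedding.mean(axis=0)
    # Orthogonal Procrustes: the best orthogonal matrix is U @ Vt,
    # where U, S, Vt is the SVD of emb_c.T @ ref_c.
    u, _, vt = np.linalg.svd(emb_c.T @ ref_c)
    return emb_c @ (u @ vt) + ref_mean


def normalized_stress(d_original, d_embedded):
    """Assumed definition: sqrt(sum (d - d_hat)^2 / sum d^2)."""
    return np.sqrt(((d_original - d_embedded) ** 2).sum() / (d_original ** 2).sum())


rng = np.random.default_rng(0)
base = rng.normal(size=(6, 2))            # stand-in for one MDS realization

# Build a second "realization" that differs only by a Euclidean transformation.
theta = 1.1
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
reflection = np.diag([1.0, -1.0])
moved = base @ rotation @ reflection + np.array([3.0, -2.0])

aligned = rigid_align(base, moved)
stress = normalized_stress(pairwise_distances(base), pairwise_distances(moved))
print("max alignment error:", np.abs(aligned - base).max())
print("normalized stress between realizations:", stress)
```

Because rotations, reflections, and translations leave pairwise distances unchanged, the two realizations are equally valid MDS solutions; aligning them to a common reference is what makes repeated runs directly comparable.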

arXiv.org e-Print Archive